Bayesian mixture labeling and clustering
Label switching is one of the fundamental issues in Bayesian mixture modeling. It
occurs because the components are nonidentifiable under symmetric priors. Unless
label switching is resolved, the ergodic averages of component-specific quantities will be identical across components and thus useless for inference about individual components, such as posterior means, predictive component densities, and marginal classification probabilities. In this article, we establish the equivalence between labeling and clustering and propose two simple clustering criteria to resolve label switching. The first method can be considered an extension of K-means clustering. The second finds the labels by minimizing the volume of the labeled samples and is invariant to scale transformations of the parameters. Using a simulation example and applications to two real data sets, we demonstrate the success of the new methods in dealing with the label switching problem.
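The K-means-style idea above can be illustrated with a small sketch: align each MCMC draw's component labels to a set of reference centers, then update the centers, and iterate. This is only an illustrative sketch of the clustering-based relabeling idea, not the paper's exact algorithm; the function name and interface are hypothetical.

```python
import numpy as np
from itertools import permutations

def relabel_samples(draws, n_iter=10):
    """Relabel MCMC draws of component-specific parameters to undo
    label switching, K-means style (illustrative sketch, not the
    paper's exact criterion).

    draws : array of shape (T, K, d) -- T MCMC iterations, K mixture
            components, d parameters per component.
    Returns a relabeled copy of the draws.
    """
    T, K, d = draws.shape
    out = draws.copy()
    perms = [list(p) for p in permutations(range(K))]
    centers = out[0].copy()          # initialize reference centers from the first draw
    for _ in range(n_iter):
        changed = False
        for t in range(T):
            # choose the permutation that moves this draw closest to the centers
            costs = [((out[t, p] - centers) ** 2).sum() for p in perms]
            best = perms[int(np.argmin(costs))]
            if best != list(range(K)):
                out[t] = out[t, best]
                changed = True
        centers = out.mean(axis=0)   # update centers, as in a K-means step
        if not changed:
            break
    return out
```

After relabeling, ergodic averages of component-specific quantities are no longer mixed across components and can be used for component-wise inference.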
Mixture of Regression Models with Single-Index
In this article, we propose a class of semiparametric mixture regression
models with a single index. We argue that many recently proposed
semiparametric/nonparametric mixture regression models can be considered
special cases of the proposed model. Unlike existing semiparametric mixture
regression models, however, the proposed model can easily incorporate
multivariate predictors into the nonparametric components. Backfitting
estimates and the corresponding algorithms are proposed to achieve the
optimal convergence rate for both the parameters and the nonparametric
functions. We show that the nonparametric functions can be estimated with the
same asymptotic accuracy as if the parameters were known, and that the index
parameters can be estimated at the traditional parametric root-n convergence
rate. Simulation studies and an application to NBA data demonstrate the
finite-sample performance of the proposed models.
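The key device, reducing multivariate predictors to a scalar index before nonparametric smoothing, can be sketched in isolation. The snippet below uses a plain Nadaraya-Watson smoother on the projected index; the paper itself uses backfitting within a mixture model, so this is only a minimal sketch of the single-index idea, with hypothetical names.

```python
import numpy as np

def nw_on_index(X, y, alpha, x0, h=0.3):
    """Nadaraya-Watson estimate of E[y | alpha' x = alpha' x0].

    Illustrates how a single index alpha reduces multivariate
    predictors X to a scalar before kernel smoothing (sketch only;
    not the paper's backfitting mixture estimator).
    """
    u = X @ alpha                              # project predictors onto the index
    u0 = x0 @ alpha
    w = np.exp(-0.5 * ((u - u0) / h) ** 2)     # Gaussian kernel weights on the index
    return (w @ y) / w.sum()                   # locally weighted average
```

Because the smoothing happens on the one-dimensional index rather than on X itself, the nonparametric step avoids the curse of dimensionality.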
Fully Bayesian Logistic Regression with Hyper-Lasso Priors for High-dimensional Feature Selection
High-dimensional feature selection arises in many areas of modern science.
For example, in genomic research we want to find the genes that can be used to
separate tissues of different classes (e.g. cancer and normal) from tens of
thousands of genes that are active (expressed) in certain tissue cells. To this
end, we wish to fit regression and classification models with a large number of
features (also called variables, predictors). In the past decade, penalized
likelihood methods for fitting regression models based on hyper-LASSO
penalization have received increasing attention in the literature. However,
fully Bayesian methods that use Markov chain Monte Carlo (MCMC) remain
underdeveloped. In this paper we introduce an MCMC
(fully Bayesian) method for learning severely multi-modal posteriors of
logistic regression models based on hyper-LASSO priors (non-convex penalties).
Our MCMC algorithm uses Hamiltonian Monte Carlo in a restricted Gibbs sampling
framework; we call our method Bayesian logistic regression with hyper-LASSO
(BLRHL) priors. We have used simulation studies and real data analysis to
demonstrate the superior performance of hyper-LASSO priors, and to investigate
the issues of choosing the heaviness and scale of hyper-LASSO priors.
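The heavy-tailed, non-convex character of hyper-LASSO-type priors can be made concrete with the log posterior of logistic regression under independent Student-t priors, one member of this family. The sketch below only evaluates the (negative) log posterior; the paper samples such multi-modal posteriors with HMC within restricted Gibbs rather than optimizing them, and the function name and defaults here are illustrative assumptions.

```python
import numpy as np

def neg_log_post(beta, X, y, df=1.0, scale=1.0):
    """Negative log posterior for logistic regression with independent
    Student-t priors on the coefficients -- an example of the heavy-tailed,
    non-convex hyper-LASSO family (illustrative sketch; the paper's BLRHL
    method samples this kind of posterior rather than optimizing it).
    """
    eta = X @ beta
    # logistic log-likelihood, written stably via logaddexp
    loglik = y @ eta - np.logaddexp(0.0, eta).sum()
    # Student-t log prior up to a constant: a sharp spike near 0 shrinks
    # noise coefficients, while heavy tails leave large signals nearly unshrunk
    logprior = -(df + 1) / 2 * np.log1p((beta / scale) ** 2 / df).sum()
    return -(loglik + logprior)
```

The logarithmic tail of the prior term is what makes the penalty non-convex, and hence the posterior potentially severely multi-modal, motivating the MCMC approach.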
Nonparametric and Varying Coefficient Modal Regression
In this article, we propose a new nonparametric data analysis tool, which we
call nonparametric modal regression, to investigate the relationship among
variables of interest by estimating the mode of the conditional density of
a response variable Y given predictors X. Nonparametric modal regression is
distinguished from the conventional nonparametric regression in that, instead
of the conditional average or median, it uses the "most likely" conditional
values to measure the center. Better prediction performance and robustness are
two important characteristics of nonparametric modal regression compared to
traditional nonparametric mean regression and nonparametric median regression.
We propose to use local polynomial regression to estimate the nonparametric
modal regression. The asymptotic properties of the resulting estimator are
investigated. To broaden the applicability of the nonparametric modal
regression to high dimensional data or functional/longitudinal data, we further
develop a nonparametric varying coefficient modal regression. A Monte Carlo
simulation study and an analysis of health care expenditure data demonstrate
the superior performance of the proposed nonparametric modal regression model
over traditional nonparametric mean and median regression in terms of
prediction performance.
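The "most likely conditional value" idea can be sketched with a kernel-weighted mean-shift iteration that climbs to the mode of the conditional density of y given x = x0. This is a local-constant sketch under assumed bandwidths; the paper develops local polynomial estimation with full asymptotic theory, and the names here are illustrative.

```python
import numpy as np

def conditional_mode(x, y, x0, hx=0.3, hy=0.3, n_steps=50):
    """Estimate the mode of the conditional density of y given x = x0
    via a kernel-weighted mean-shift iteration (illustrative sketch;
    the paper's estimator is based on local polynomial regression).
    """
    wx = np.exp(-0.5 * ((x - x0) / hx) ** 2)   # kernel weights in the x direction
    m = np.sum(wx * y) / np.sum(wx)            # start from the local mean
    for _ in range(n_steps):
        wy = np.exp(-0.5 * ((y - m) / hy) ** 2)
        w = wx * wy
        m_new = np.sum(w * y) / np.sum(w)      # mean-shift update toward the mode
        if abs(m_new - m) < 1e-8:
            m = m_new
            break
        m = m_new
    return m
```

When the conditional density is skewed or contaminated by outliers, the iteration settles on the dominant peak, which is why modal regression can be both more robust and a better point predictor than the conditional mean.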
Editor's Preface and Table of Contents
These proceedings contain papers presented at the twenty-third annual Kansas State University Conference on Applied Statistics in Agriculture, held in Manhattan, Kansas, May 1-3, 2011.